General Gap Analysis for Autonomic Networking


Abstract


This document provides a problem statement and general gap analysis 
for an IP-based Autonomic Network that is mainly based on distributed 
network devices. The document provides background by reviewing the 
current status of autonomic aspects of IP networks and the extent to 
which current network management depends on centralization and human 
administrators. Finally, the document outlines the general features 
that are missing from current network abilities and are needed in the 
ideal Autonomic Network concept. 


This document is a product of the IRTF’s Network Management Research 
Group. 


Status of This Memo 


This document is not an Internet Standards Track specification; it is 
published for informational purposes. 


This document is a product of the Internet Research Task Force 
(IRTF). The IRTF publishes the results of Internet-related research 
and development activities. These results might not be suitable for 
deployment. This RFC represents the consensus of the Network 
Management Research Group of the Internet Research Task Force (IRTF). 
Documents approved for publication by the IRSG are not a candidate 
for any level of Internet Standard; see Section 2 of RFC 5741. 


Information about the current status of this document, any errata, 
and how to provide feedback on it may be obtained at 
http://www.rfc-editor.org/info/rfc7576.


1. Introduction


The general goals and relevant definitions for Autonomic Networking 
are discussed in [RFC7575]. In summary, the fundamental goal of an 
Autonomic Network is self-management, including self-configuration, 
self-optimization, self-healing, and self-protection. Whereas 
interior gateway routing protocols such as OSPF and IS-IS largely 
exhibit these properties, most other aspects of networking require 
top-down configuration, often involving human administrators and a
considerable degree of centralization. In essence, Autonomic 
Networking is putting all network configurations onto the same 
footing as routing, limiting manual or database-driven configuration 
to an essential minimum. It should be noted that this is highly 
unlikely to eliminate the need for human administrators, because many 
of their essential tasks will remain. The idea is to eliminate 
tedious and error-prone tasks, for example, manual calculations, 
cross-checking between two different configuration files, or tedious 
data entry. Higher-level operational tasks, and complex 
troubleshooting, will remain to be done by humans. 


This document represents the consensus of the IRTF’s Network 
Management Research Group (NMRG). It first provides background by 
identifying examples of partial autonomic behavior in the Internet 
and by describing important areas of non-autonomic behavior. Based 
on these observations, it then describes missing general mechanisms 
that would allow autonomic behaviors to be added throughout the 
Internet. 


2. Terminology


The terminology defined in [RFC7575] is used in this document.


3. Automatic and Autonomic Aspects of Current IP Networks


This section discusses the history and current status of automatic or 
autonomic operations in various aspects of network configuration, in 
order to establish a baseline for the gap analysis. In particular, 
routing protocols already contain elements of autonomic processes, 
such as information exchange and state synchronization. 


3.1. IP Address Management and DNS


For many years, there was no alternative to completely manual and 
static management of IP addresses and their prefixes. Once a site 
had received an IPv4 address assignment (usually a Class C /24 or 
Class B /16, and rarely a Class A /8), it was a matter of paper-and- 
pencil design of the subnet plan (if relevant) and the addressing 
plan itself. Subnet prefixes were manually configured into routers, 
and /32 addresses were assigned administratively to individual host
computers and configured manually by system administrators. Records 
were typically kept in a plain text file or a simple spreadsheet. 


Clearly, this method was clumsy and error-prone as soon as a site had 
more than a few tens of hosts, but it had to be used until DHCP 
[RFC2131] became a viable solution during the second half of the 
1990s. DHCP made it possible to avoid manual configuration of 
individual hosts (except, in many deployments, for a small number of 
servers configured with static addresses). Even so, prefixes had to 
be manually assigned to subnets and their routers, and DHCP servers 
had to be configured accordingly. 


In terms of management, there is a linkage between IP address 
management and DNS management, because DNS mappings typically need to 
be appropriately synchronized with IP address assignments. At 
roughly the same time as DHCP came into widespread use, it became 
very laborious to manually maintain DNS source files in step with IP 
address assignments. Because of reverse DNS lookup, it also became 
necessary to synthesize DNS names even for hosts that only played the 
role of clients. Therefore, it became necessary to synchronize DHCP 
server tables with forward and reverse DNS. For this reason, IP 
address management tools emerged, as discussed for the case of 
renumbering in [RFC7010]. These are, however, centralized solutions 
that do not exhibit autonomic properties as defined in [RFC7575]. 
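

As an illustration of the synchronization that such tools perform,
the following Python sketch derives forward (A) and reverse (PTR) DNS
records from a DHCP lease table. The lease table layout, the function
name dns_records_for_leases, and the domain example.net are
assumptions made for this example only; they are not taken from
[RFC7010] or from any particular tool, and only IPv4 is handled.

```python
import ipaddress


def dns_records_for_leases(leases, domain):
    """Return (forward, reverse) DNS record lists for IPv4 DHCP leases.

    leases maps a hypothetical host label to its leased address string.
    """
    forward = []
    reverse = []
    for host, addr in sorted(leases.items()):
        ip = ipaddress.ip_address(addr)
        fqdn = f"{host}.{domain}."
        forward.append(f"{fqdn} IN A {ip}")
        # reverse_pointer gives the name used for the PTR record,
        # for example "10.2.0.192.in-addr.arpa" for 192.0.2.10.
        reverse.append(f"{ip.reverse_pointer}. IN PTR {fqdn}")
    return forward, reverse


# Hypothetical lease table using documentation addresses.
leases = {"client-17": "192.0.2.10", "client-18": "192.0.2.11"}
forward, reverse = dns_records_for_leases(leases, "example.net")
```

Generating both record sets from the single lease table is what keeps
forward and reverse DNS in step with address assignment; the
generation is still driven from one central place.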


A related issue is prefix delegation, especially in IPv6 when more 
than one prefix may be delegated to the same physical subnet. DHCPv6 
Prefix Delegation [RFC3633] is a useful solution, but it requires
specific configuration, so it cannot be considered autonomic. How this
topic is to be handled in home networks is still under discussion
[Pfister]. Still further away is autonomic assignment and delegation 
of routable IPv4 subnet prefixes. 


An IPv6 network needs several aspects of host address assignments to 
be configured. The network might use stateless address 
autoconfiguration [RFC4862] or DHCPv6 [RFC3315] in stateless or 
stateful modes, and there are various alternative forms of Interface 
Identifier [RFC7136]. 


Another feature is the possibility of Dynamic DNS Update [RFC2136]. 
With appropriate security, this is an automatic approach, where no 
human intervention is required to create the DNS records for a host 
after it acquires a new address. However, there are coexistence 
issues with a traditional DNS setup, as described in [RFC7010].


3.2. Routing


Since a very early stage, it has been a goal that Internet routing 
should be self-healing when there is a failure of some kind in the 
routing system (i.e., a link or a router fails). Also, the
problem of finding optimal routes through a network was identified 
many years ago as a problem in mathematical graph theory, for which 
well known algorithms were discovered (the Dijkstra and Bellman-Ford 
algorithms). Thus, routing protocols became largely autonomic from 
the start, as it was clear that manual configuration of routing 
tables for a large network was impractical. 
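

A minimal Python sketch of the Dijkstra algorithm mentioned above is
given here for illustration. The topology, node names, and link costs
are hypothetical, and the sketch computes only path costs; it is not
a description of how any particular routing protocol is implemented.

```python
import heapq


def shortest_path_costs(graph, source):
    """Dijkstra's algorithm over non-negative link costs.

    graph maps each node to a dict of {neighbor: link_cost}.
    Returns a dict of the lowest total cost from source to each
    reachable node.
    """
    costs = {source: 0}
    queue = [(0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > costs.get(node, float("inf")):
            continue  # stale queue entry; a cheaper path was found
        for neighbor, link_cost in graph.get(node, {}).items():
            new_cost = cost + link_cost
            if new_cost < costs.get(neighbor, float("inf")):
                costs[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return costs


# Hypothetical four-router topology with symmetric link costs.
topology = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
costs_from_a = shortest_path_costs(topology, "A")
```

Removing a failed link from the topology and running the computation
again yields the replacement paths, which is the self-healing
behavior described above.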


IGP routers do need some initial configuration data to start up the 
autonomic routing protocol. Also, BGP-4 routers need detailed static 
configuration of routing policy data. 


3.3. Configuration of Default Router in a Host 


Originally, the configuration of a default router in a host was a 
manual operation. Since the deployment of DHCP, this has been 
automatic as far as most IPv4 hosts are concerned, but the DHCP 
server must be appropriately configured. In simple environments such 
as a home network, the DHCP server resides in the same box as the 
default router, so this configuration is also automatic. In more 
complex environments, where an independent DHCP server or a local 
DHCP relay is used, DHCP configuration is more complex and not 
automatic. 


In IPv6 networks, the default router is provided by Router 
Advertisement messages [RFC4861] from the router itself, and all IPv6 
hosts make use of it. The router may also provide more complex Route 
Information Options. The process is essentially autonomic as far as 
all IPv6 hosts are concerned, and DHCPv6 is not involved. However,
there are still open issues when more than one prefix is in use on a 
subnet, and more than one first-hop router may be available as a 
result (see, for example, [RFC6418]). 


3.4. Hostname Lookup 


Originally, hostnames were looked up in a static table, often
referred to as "hosts.txt" from its traditional file name. When the
DNS was deployed during the 1980s, all hosts needed DNS resolver code
and needed to be configured with the IP addresses (not the names) of
suitable DNS servers. Like the default router, these were originally
manually configured. Today, they are provided automatically via DHCP
or DHCPv6 [RFC3315]. For IPv6 end systems, there is also a way for
them to be provided automatically via a Router Advertisement option.
However, the DHCP or DHCPv6 server, or the IPv6 router, needs to be
configured with the appropriate DNS server addresses. Additionally,
some networks deploy Multicast DNS [RFC6762] locally to provide 
additional automation of the name space. 


3.5. User Authentication and Accounting 


Originally, user authentication and accounting were mainly based on
physical connectivity and the degree of trust that follows from 
direct connectivity. Network operators charged based on the setup of 
dedicated physical links with users. Automated user authentication
was introduced by the Point-to-Point Protocol [RFC1661] [RFC1994]
and the RADIUS protocol [RFC2865] [RFC2866] in the early 1990s. As long
as a user completes online authentication through the RADIUS 
protocol, the accounting for that user starts on the corresponding 
Authentication, Authorization, and Accounting (AAA) server 
automatically. This mechanism enables business models with charging 
based on the amount of traffic or time. However, user authentication 
information continues to be manually managed by network 
administrators. It also becomes complex in the case of mobile users 
who roam between operators, since prior relationships between the 
operators are needed. 


3.6. Security 


Security has many aspects that need configuration and are therefore 
candidates to become autonomic. On the other hand, it is essential 
that a network’s central policy be applied strictly for all security 
configurations. As a result, security has largely been based on 
centrally imposed configurations. 


Many aspects of security depend on policy, for example, password 
rules, privacy rules, firewall rulesets, intrusion detection and 
prevention settings, VPN configurations, and the choice of 
cryptographic algorithms. Policies are, by definition, human made 
and will therefore also persist in an autonomic environment. 
However, policies are becoming more high-level, abstracting 
addressing, for example, and focusing on the user or application. 
The methods to manage, distribute, and apply policy and to monitor 
compliance and violations could be autonomic. 


Today, many security mechanisms show some autonomic properties. For
example, user authentication via IEEE 802.1X allows automatic mapping
of users after authentication into logical contexts (typically
VLANs). While configuration is still very important today, the
overall mechanism displays signs of self-adaptation to changing
situations.


BGP Flowspec [RFC5575] allows a partially autonomic threat-defense
mechanism, where threats are identified, the flow information is
automatically distributed, and counter-actions can be applied.
Today, typically a human operator is still in the loop to check
correctness, but over time such mechanisms can become more autonomic.


Negotiation capabilities, present in many security protocols, also 
display simple autonomic behaviors. In this case, a security policy 
about algorithm strength can be configured into servers but will 
propagate automatically to clients. 
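

The idea can be sketched in Python as follows. This is a generic
illustration, not the negotiation procedure of any specific security
protocol; the function name negotiate and the option names are
hypothetical.

```python
def negotiate(server_preference, client_supported):
    """Pick the first server-preferred option the client supports.

    server_preference is ordered from most to least preferred.
    Returns None when there is no option in common.
    """
    supported = set(client_supported)
    for option in server_preference:
        if option in supported:
            return option
    return None


# Hypothetical option names, strongest first, as configured on a server.
server_policy = ["strong-suite", "medium-suite"]
chosen = negotiate(server_policy, ["weak-suite", "medium-suite"])
```

Because the server's list omits the weak option, no client can end up
using it, even though no client was individually reconfigured.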


3.7. State Synchronization 


Another area where autonomic processes between peers are involved is 
state synchronization. In this case, several devices start out with 
inconsistent state and go through a peer-to-peer procedure after 
which their states are consistent. Many autonomic or automatic 
processes include some degree of implicit state synchronization. 
Network time synchronization [RFC5905] is a well-established explicit 
example, guaranteeing that a participating node’s clock state is 
synchronized with reliable time servers within a defined margin of 
error, without any overall point of control of the synchronization 
process. 
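

The basic calculation behind such synchronization can be sketched in
Python as follows, using the four-timestamp exchange described in
[RFC5905]. The function and variable names are illustrative, and the
sketch omits the filtering, selection, and clock discipline steps
that a real implementation needs.

```python
def clock_offset_and_delay(t1, t2, t3, t4):
    """Estimate clock offset and round-trip delay from four timestamps.

    t1: client send time, t2: server receive time,
    t3: server send time, t4: client receive time (all in seconds).
    t1 and t4 are read from the client clock, t2 and t3 from the
    server clock.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Hypothetical exchange: the client clock is 5 s behind the server,
# each direction takes 0.25 s, and the server takes 0.25 s to reply.
offset, delay = clock_offset_and_delay(100.0, 105.25, 105.5, 100.75)
```

The offset estimate is exact only when the delay is the same in both
directions; any asymmetry appears as an error in the estimated
offset.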


4. Current Non-autonomic Behaviors 


In current networks, many operations are still heavily dependent on 
human intelligence and decision, or on centralized top-down network 
management systems. These operations are the targets of Autonomic 
Networking technologies. The ultimate goal of Autonomic Networking 
is to replace human and automated operations by autonomic functions, 
so that the networks can run independently without depending on a 
human or Network Management System (NMS) for routine details, while 
maintaining central control where required. Of course, there would 
still be the absolute minimum of human input required, particularly 
during the network-building stage, emergencies, and difficult 
troubleshooting. 


This section analyzes the existing human and central dependencies in 
typical networks and suggests cases where they could, in principle, 
be replaced by autonomic behaviors. 


4.1. Building a New Network


Building a network requires the operator to analyze the requirements
of the new network, design a deployment architecture and topology,
decide device locations and capacities, set up hardware, design
network services, choose and enable required protocols, configure
each device and each protocol, set up central user authentication and
accounting policies and databases, design and deploy security 
mechanisms, etc. 


Overall, these tasks are complex and cannot become fully
autonomic in the foreseeable future. However, some of them may
be able to become autonomic, such as detailed device and protocol
configurations and database population. The initial network 
management policies/behaviors may also be transplanted from other 
networks and automatically localized. 


4.2. Network Maintenance and Management 


Network maintenance and management are very different for ISP 
networks and enterprise networks. ISP networks have to change much 
more frequently than enterprise networks, given the fact that ISP 
networks have to serve a large number of customers who have very 
diversified requirements. The current rigid model is that network 
administrators design a limited number of services for customers to 
order. Human management may not be able to meet new network service
requirements quickly. Given a real-time request, the
response must be autonomic, in order to be flexible and quickly 
deployed. However, behind the interface, describing abstracted 
network information and user authorization management may have to 
depend on human intelligence from network administrators in the 
foreseeable future. User identification integration/consolidation 
among networks or network services is another challenge for Autonomic 
Network access. Currently, many end users have to manually manage 
their user accounts and authentication information when they switch 
among networks or network services. 


Classical network maintenance and management mainly handle the 
configuration of network devices. Tools have been developed to 
enable remote management and make such management easier. However, 
the decision about each configuration detail depends either on human 
intelligence or rigid templates. Therefore, these are the sources of 
all network configuration errors -- the human was wrong, the template 
was wrong, or both were wrong. This is also a barrier to increasing 
the utility of network resources, because the human managers cannot 
respond quickly enough to network events, such as traffic bursts, 
that were not foreseen in the template. For example, currently, a 
light load is often assumed in network design because there is no 
mechanism to properly handle a sudden traffic flood. It is therefore 
common to avoid performance collapses caused by traffic overload by 
configuring idle resources, with an overprovisioning ratio of at 
least 2 being normal [Xiao02].


There are grounds for concern that the introduction of new, more
flexible, methods of network configuration, typified by Software- 
Defined Networking (SDN), will only make the management problem more 
complex unless the details are managed automatically or 
autonomically. There is no doubt that SDN creates both the necessity 
and the opportunity for automation of configuration management, e.g., 
[Kim13]. This topic is discussed from a service provider viewpoint 
in [RFC7149]. 


Autonomic decision processes for configuration would enable dynamic 
management of network resources (by managing resource-relevant 
configuration). Self-adapting network configuration would adjust the 
network into the best possible situation; this would prevent 
configuration errors from having lasting impact. 


4.3. Security Setup 


Setting up security for a network generally requires very detailed 
human intervention or relies entirely on default configurations that 
may be too strict or too risky for the particular situation of the 
network. While some aspects of security are intrinsically top-down 
in nature (e.g., broadcasting a specific security policy to all 
hosts), others could be self-managed within the network. 


In an Autonomic Network, where nodes within a domain have a mutually 
verifiable domain identity, security processes could run entirely 
automatically. Nodes could identify each other securely, negotiating 
required security settings and even shared keys if needed. The 
locations of the trust anchors (certificate authority, registration 
authority), certificate revocation lists, policy server, etc., can be 
found by service discovery. Transactions such as a download of a 
certificate revocation list can be authenticated via a common trust 
anchor. Policy distribution can also be entirely automated and 
secured via a common trust anchor. 


These concepts lead to a network where the intrinsic security is 
automatic and applied by default, i.e., a "self-protecting" network. 
For further discussion, see [Behringer]. 


4.4. Troubleshooting and Recovery 


Current networks suffer difficulties in locating the cause of network 
failures. Although network devices may issue many warnings while 
running, most of them are not sufficiently precise to be identified 
as errors. Some of them are early warnings that would not develop 
into real errors. Others are, in effect, random noise. During a 
major failure, many different devices will issue multiple warnings 
within a short time, causing overload for the NMS and the operators. 
However, for many scenarios, human experience is still vital to
identify real issues and locate them. This situation may be improved 
by automatically associating warnings from multiple network devices 
together. Also, introducing automated learning techniques (comparing 
current warnings with historical relationships between warnings and 
actual faults) could increase the possibility and success rate of 
Autonomic Network diagnoses and troubleshooting. 
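

As a simple illustration of associating warnings from multiple
devices, the following Python sketch groups warnings that occur close
together in time. The tuple layout, the device names, and the choice
of a fixed time window are assumptions for this example; a practical
system would also use topology and historical data.

```python
def correlate_warnings(warnings, window_seconds):
    """Group warnings whose timestamps fall close together.

    warnings is a list of (timestamp_seconds, device, message)
    tuples. A new group starts whenever the gap to the previous
    warning exceeds window_seconds.
    """
    groups = []
    for warning in sorted(warnings):
        if groups and warning[0] - groups[-1][-1][0] <= window_seconds:
            groups[-1].append(warning)
        else:
            groups.append([warning])
    return groups


# Hypothetical warnings from four devices.
warnings = [
    (100, "router-1", "link down"),
    (101, "switch-4", "neighbor lost"),
    (103, "router-2", "route withdrawn"),
    (900, "switch-9", "fan speed high"),
]
groups = correlate_warnings(warnings, window_seconds=5)
```

Presenting the first three warnings as one candidate incident, rather
than as three separate alarms, reduces the load on the NMS and the
operators during a major failure.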


Some network errors (particularly hardware
failures) may always require human intervention. However, Autonomic
Network management behavior may help to reduce the impact of errors, 
for example, by switching traffic flows around. Today, this is 
usually manual (except for classical routing updates). Fixing 
software failures and configuration errors currently depends on 
humans and may even involve rolling back software versions and 
rebooting hardware. Such problems could be autonomically corrected 
if there were diagnostics and recovery functions defined in advance 
for them. This would fulfill the concept of self-healing. 


Another possible autonomic function is predicting device failures or 
overloads before they occur. A device could predict its own failure 
and warn its neighbors, or a device could predict its neighbor’s 
failure. In either case, an Autonomic Network could respond as if 
the failure had already occurred by routing around the problem and 
reporting the failure, with no disturbance to users. The criteria 
for predicting failure could be temperature, battery status, bit 
error rates, etc. The criteria for predicting overload could be 
increasing load factor, latency, jitter, congestion loss, etc. 


5. Features Needed by Autonomic Networks 


There are innumerable properties of network devices and end systems 
that today need to be configured either manually, by scripting, or by 
using a management protocol such as the Network Configuration 
Protocol (NETCONF) [RFC6241]. In an Autonomic Network, all of these 
would need to either have satisfactory default values or be 
configured automatically. Some examples are parameters for tunnels 
of various kinds, flows (in an SDN context), quality of service, 
service function chaining, energy management, system identification, 
and NTP configuration, but the list is endless. 


The task of Autonomic Networking is to incrementally build up 
individual autonomic processes that could progressively be combined 
to respond to every type of network event. Building on the preceding 
background information, and on the reference model in [RFC7575], this 
section outlines the gaps and missing features in general terms and, 
in some cases, mentions general design principles that should apply.


5.1. More Coordination among Devices or Network Partitions


Network services are dependent on a number of devices and parameters 
to be in place in a certain order. For example, after a power 
failure, a coordinated sequence of "return to normal" operations is 
desirable (e.g., switches and routers first, DNS servers second, 
etc.). Today, the correct sequence of events is either known only by 
a human administrator or automated in a central script. In a truly
Autonomic Network, elements should understand their dependencies and 
be able to resolve them locally. 
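

The ordering problem itself can be sketched in Python as a
topological sort over declared dependencies. The element names and
their dependencies are hypothetical, and the sketch computes the
order in one place purely for illustration; in an Autonomic Network,
each element would know only its own dependencies and resolve them
with its peers. The graphlib module requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def restart_order(depends_on):
    """Return a start-up order that respects all dependencies.

    depends_on maps an element to the set of elements that must be
    running before it. graphlib.CycleError is raised if the
    dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical dependencies for the power-failure example.
depends_on = {
    "switch": set(),
    "router": {"switch"},
    "dns-server": {"router"},
    "mail-server": {"dns-server", "router"},
}
order = restart_order(depends_on)
```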


To make correct or good decisions autonomically, network
devices need more than just reachability (routing) information
from relevant or neighboring devices. Devices
must be able to derive, for themselves, the dependencies between such 
information and configurations. 


There are therefore increased requirements for horizontal information 
exchange in the networks. Particularly, three types of interaction 
among peer network devices are needed for autonomic decisions: 
discovery (to find neighbors and peers), synchronization (to agree on 
network status), and negotiation (when things need to be changed). 
Thus, there is a need for reusable discovery, synchronization, and 
negotiation mechanisms that would support the discovery of many 
different types of device, the synchronization of many types of 
parameter, and the negotiation of many different types of objective. 


5.2. Reusable Common Components 


Elements of autonomic functions already exist today, within many 
different protocols. However, all such functions have their own 
discovery, transport, messaging, and security mechanisms as well as 
non-autonomic management interfaces. Each protocol has its own 
version of the above-mentioned functions to serve specific and narrow 
purposes. It is often difficult to extend an existing protocol to 
serve different purposes. Therefore, in order to provide the 
reusable discovery, synchronization, and negotiation mechanisms 
mentioned above, it is desirable to develop a set of reusable common 
protocol components for Autonomic Networking. These components 
should be: 


o Able to identify other devices, users, and processes securely. 


o Able to automatically secure operations, based on the above 
identity scheme. 


o Able to manage any type of information and information flows.


o Able to discover peer devices and services for various Autonomic
Service Agents (or autonomic functions). 


o Able to support closed-loop operations when needed to provide 
self-managing functions involving more than one device. 


o Separable from the specific Autonomic Service Agents (or autonomic
functions). 


o Reusable by other autonomic functions.


5.3. Secure Control Plane


The common components will, in effect, act as a control plane for 
autonomic operations. This control plane might be implemented in- 
band as functions of the target network, in an overlay network, or 
even out-of-band in a separate network. Autonomic operations will be 
capable of changing how the network operates and allocating resources 
without human intervention or knowledge, so it is essential that they 
are secure. Therefore, the control plane must be designed to be 
secure against forged autonomic operations and man-in-the-middle
attacks and as secure as reasonably possible against denial-of-
service attacks. It must be decided whether the control plane needs
to be resistant to unwanted monitoring, i.e., whether encryption is
required.


5.4. Less Configuration


Many existing protocols have been defined to be as flexible as 
possible. Consequently, these protocols need numerous initial
configurations to start operations. There are choices and options
that are irrelevant in any particular case, some of which target 
corner cases. Furthermore, in protocols that have existed for years, 
some design considerations are no longer relevant, since the 
underlying hardware technologies have evolved meanwhile. To 
appreciate the scale of this problem, consider that more than 160 
DHCP options have been defined for IPv4. Even sample router 
configuration files readily available online contain more than 200 
lines of commands. There is therefore considerable scope for
simplifying the operational tools for configuration of common
protocols, even if the underlying protocols themselves cannot be 
simplified. 


From another perspective, the underlying reason why human decisions
are often needed is a lack of information. When a
device can collect enough information horizontally from other 
devices, it should be able to decide many parameters by itself, 
instead of receiving them from top-down configuration.


Autonomic Networking aims to reduce top-down management. Ideally,
only the abstract Intent is needed from the
human administrators. Neither users nor administrators should need 
to create and maintain detailed policies and profiles; if they are 
needed, they should be built autonomically. The local parameters 
should be decided by distributed Autonomic Nodes themselves, either 
from historic knowledge, analytics of current conditions, closed 
logical decision loops, or a combination of all. 


5.5. Forecasting and Dry Runs 


In a conventional network, there is no mechanism for trying something 
out safely, which means that configuration changes have to be 
designed in the abstract and their probable effects have to be 
estimated theoretically. In principle, an alternative to this would 
be to test the changes on a complete and realistic network simulator. 
However, this is a practical impossibility for a large network that 
is constantly changing, even if an accurate simulation could be 
performed. There is therefore a risk that applying changes to a 
running network will cause a failure of some kind. An autonomic 
network could fill this gap by supporting a closed loop "dry run" 
mode in which each configuration change could be tested out 
dynamically in the control plane without actually affecting the data 
plane. If the results are satisfactory, the change could be made 
live on the running network. If there is a consistency problem such 
as overcommitment of resources or incompatibility with another 
configuration setting, the change could be rolled back dynamically 
with no impact on traffic or users. 
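

A minimal Python sketch of this idea follows. It checks a proposed
change for overcommitment of resources on a copy of the configuration
before anything is applied. The function name dry_run, the flow
names, and the single bandwidth capacity figure are assumptions for
illustration; a real dry run would evaluate many more consistency
rules.

```python
def dry_run(current, change, capacity):
    """Test a change on a copy of the configuration.

    current and change map a hypothetical flow name to reserved
    bandwidth; capacity is the total bandwidth available. Returns
    (ok, config): the candidate configuration when it fits, or the
    unchanged current configuration when it does not.
    """
    candidate = dict(current)  # never modify the running config
    candidate.update(change)
    if sum(candidate.values()) > capacity:
        return False, current  # overcommitted: keep running config
    return True, candidate


running = {"voice": 20, "video": 50}
ok, running_after = dry_run(running, {"backup": 40}, capacity=100)
```

Here the change is rejected because it would reserve 110 units
against a capacity of 100, and the running configuration is left
untouched.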


5.6. Benefit from Knowledge 


The more knowledge and experience we have, the better decisions we 
can make. It is the same for networks and network management. When 
one component in the network lacks knowledge that affects what it 
should do, and another component has that knowledge, we usually rely 
on a human operator or a centralized management tool to convey the 
knowledge. 


Up to now, the only available network knowledge is usually the 
current network status inside a given device or relevant current 
status from other devices. 


However, historic knowledge is very helpful in making correct
decisions, in particular, to reduce network oscillation or to manage
network resources over time. Transplantable knowledge from other 
networks can be helpful to initially set up a new network or new 
network devices. Knowledge of relationships between network events
and network configuration may help a network to decide the best 
parameters according to real performance feedback. 


In addition to such historic knowledge, powerful data analytics of 
current network conditions may also be a valuable source of knowledge 
that can be exploited directly by Autonomic Nodes. 


6. Security Considerations 


This document is focused on what is missing to allow autonomic 
network configuration, including security settings, of course. 
Therefore, it does not itself create any new security issues. It is 
worth underlining that autonomic technology must be designed with 
strong security properties from the start, since a network with 
vulnerable autonomic functions would be at great risk. 